Strong validity, consonance, and conformal prediction
Valid prediction of future observations is an important and challenging
problem. The two general approaches for quantifying uncertainty about the
future value employ prediction regions and predictive distributions,
respectively, with the latter usually considered to be more informative because
it performs other prediction-related tasks. Standard notions of validity focus
on the former, i.e., coverage probability bounds for prediction regions, but a
notion of validity relevant to the other prediction-related tasks performed by
the latter is lacking. In this paper, we present a new notion---strong
prediction validity---relevant to these more general prediction tasks. We show
that strong validity is connected to more familiar notions of coherence, and
argue that imprecise probability considerations are required in order to
achieve it. We go on to show that strong prediction validity can be achieved by
interpreting the conformal prediction output as the contour function of a
consonant plausibility function. We also offer an alternative characterization,
based on a new nonparametric inferential model construction, wherein the
appearance of consonance is more natural, and prove strong prediction validity.
Comment: 34 pages, 3 figures, 2 tables. Comments welcome at
https://www.researchers.one/article/2020-01-1
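The conformal output described above can be illustrated numerically. The sketch below is not the paper's construction, only the standard full-conformal p-value with an assumed absolute-deviation-from-the-mean nonconformity score, read as a plausibility contour:

```python
import numpy as np

def plausibility(data, y):
    """Conformal plausibility contour pl(y): the fraction of points in the
    augmented sample whose nonconformity is at least that of the candidate y.
    Nonconformity score: absolute deviation from the augmented-sample mean."""
    aug = np.append(np.asarray(data, dtype=float), y)
    scores = np.abs(aug - aug.mean())
    return np.mean(scores >= scores[-1])

def prediction_region(data, alpha, grid):
    """Level-alpha conformal prediction region: candidates with pl(y) > alpha."""
    return [y for y in grid if plausibility(data, y) > alpha]

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(plausibility(data, 3.0))    # central candidate: maximal plausibility 1.0
print(plausibility(data, 100.0))  # extreme candidate: plausibility 1/6
```

Thresholding pl at level alpha yields a region with coverage at least 1 - alpha under exchangeability; treating pl itself as the contour of a consonant plausibility function is what the paper argues delivers the stronger notion of validity.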
Best Model Identification: A Rested Bandit Formulation
We introduce and analyze a best arm identification problem in the rested bandit setting, wherein arms are themselves learning algorithms whose expected losses decrease with the number of times the arm has been played. The shape of the expected loss functions is similar across arms, and is assumed to be available up to unknown parameters that have to be learned on the fly. We define a novel notion of regret for this problem, where we compare to the policy that always plays the arm having the smallest expected loss at the end of the game. We analyze an arm elimination algorithm whose regret vanishes as the time horizon increases. The actual rate of convergence depends in a detailed way on the postulated functional form of the expected losses. We complement our analysis with lower bounds, indicating strengths and limitations of the proposed solution.
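As a toy illustration of the elimination idea (not the paper's algorithm), suppose each arm's expected loss on its k-th play is c_i + b/k, with the transient shape b/k known and the asymptotic level c_i unknown. Debiasing the observations by the known transient lets a standard confidence-bound elimination find the arm with the smallest final loss; the loss shape and all names here are assumptions for illustration.

```python
import numpy as np

def eliminate(c_true, b=1.0, sigma=0.1, rounds=2000, seed=0):
    """Toy successive elimination for rested arms whose expected loss when
    played for the k-th time is c_i + b / k (b known, c_i unknown)."""
    rng = np.random.default_rng(seed)
    K = len(c_true)
    active = list(range(K))
    sums = np.zeros(K)              # running sums of debiased losses
    counts = np.zeros(K, dtype=int)
    for _ in range(rounds):
        for i in active:
            counts[i] += 1
            obs = c_true[i] + b / counts[i] + sigma * rng.standard_normal()
            sums[i] += obs - b / counts[i]   # subtract the known transient
        est = sums[active] / counts[active]
        rad = sigma * np.sqrt(2 * np.log(rounds) / counts[active])
        # keep arms whose lower bound does not exceed the best upper bound
        best_hi = (est + rad).min()
        active = [a for a, lo in zip(active, est - rad) if lo <= best_hi]
        if len(active) == 1:
            break
    return active

print(eliminate([0.2, 0.5, 0.9]))  # the arm with the smallest asymptotic loss survives
```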
Meta-learning with Stochastic Linear Bandits
We investigate meta-learning procedures in the setting of stochastic linear
bandit tasks. The goal is to select a learning algorithm that works well on
average over a class of bandit tasks sampled from a
task-distribution. Inspired by recent work on learning-to-learn linear
regression, we consider a class of bandit algorithms that implement a
regularized version of the well-known OFUL algorithm, where the regularization
is the squared Euclidean distance to a bias vector. We first study the benefit of
the biased OFUL algorithm in terms of regret minimization. We then propose two
strategies to estimate the bias within the learning-to-learn setting. We show
both theoretically and experimentally, that when the number of tasks grows and
the variance of the task-distribution is small, our strategies have a
significant advantage over learning the tasks in isolation.
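The biased regularization behind this OFUL variant has a simple closed form: the estimate shrunk toward a bias vector h is (X'X + lam*I)^{-1}(X'y + lam*h). A minimal sketch (function name and simulation are illustrative, not from the paper):

```python
import numpy as np

def biased_ridge(X, y, lam, h):
    """Least squares shrunk toward a bias vector h:
    argmin_theta  ||y - X theta||^2 + lam * ||theta - h||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * h)

rng = np.random.default_rng(0)
theta_star = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((5, 3))          # only 5 noise-free samples
y = X @ theta_star

# With a well-chosen bias (h close to theta_star) heavy regularization helps;
# with h = 0 it shrinks the estimate toward zero instead.
good = biased_ridge(X, y, lam=100.0, h=theta_star)
zero = biased_ridge(X, y, lam=100.0, h=np.zeros(3))
print(np.linalg.norm(good - theta_star), np.linalg.norm(zero - theta_star))
```

The meta-learning question is then how to estimate a good h from previously seen tasks, which is what the two proposed strategies address.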
Bayesian Ordinal Regression (Regressão ordinal Bayesiana)
Master's dissertation—Universidade de Brasília, Instituto de Ciências Exatas,
Departamento de Estatística, 2013.
This work presents inference for the ordinal regression model under the logit
link and the multinomial likelihood approach. A reparametrization of the
regression model is proposed. Inference is carried out in a Bayesian framework
using MCMC (Markov chain Monte Carlo) techniques. Point estimates of the
parameters and their respective HPD credibility intervals are presented, as
well as a genuinely Bayesian significance test, the FBST (Full Bayesian
Significance Test), for the regression parameters. The methodology was applied
to simulated data and illustrated on a genetic problem that assessed the
influence of a certain type of radiation on the occurrence of cellular damage.
The multinomial likelihood approach combined with the reparametrization of the
model is easy to handle thanks to increased computing power and advances in
MCMC methods. Moreover, the FBST proved to be a simple and useful procedure
for testing the significance of the regression coefficients, motivating the
use of a Bayesian approach for modeling ordinal data.
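The cumulative-logit model underlying this kind of ordinal regression can be sketched as follows; the helper below (all names are assumptions for illustration) computes category probabilities P(Y = j | x) as differences of the cumulative probabilities P(Y <= j | x) = logistic(alpha_j - x'beta):

```python
import numpy as np

def ordinal_logit_probs(x, beta, cutpoints):
    """Category probabilities of a cumulative-logit (proportional odds) model:
    P(Y <= j | x) = 1 / (1 + exp(-(alpha_j - x . beta))), cutpoints increasing."""
    eta = np.asarray(cutpoints, dtype=float) - np.dot(x, beta)
    cdf = np.concatenate(([0.0], 1.0 / (1.0 + np.exp(-eta)), [1.0]))
    return np.diff(cdf)

probs = ordinal_logit_probs(x=[1.0, 0.5], beta=[0.8, -0.3],
                            cutpoints=[-1.0, 0.0, 1.5])
print(probs, probs.sum())  # four category probabilities summing to 1
```

An MCMC sampler over (beta, cutpoints) would then yield posterior draws from which the HPD intervals and the FBST e-value for each regression coefficient can be computed.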
Estimate features relevance for groups of users
In item cold-start, collaborative filtering techniques cannot
be used directly since newly added items have no interactions with users.
Hence, content-based filtering is usually the only viable option left.
In this paper we propose a feature-based machine learning model that
addresses the item cold-start problem by jointly exploiting item content
features, past user preferences, and interactions of similar users. The
proposed solution learns the relevance of each content feature with respect
to a community of similar users. In our experiments, the proposed approach
outperforms classical content-based filtering on an enriched version of
the Netflix dataset.
Exploring the Semantic Gap for Movie Recommendations
In recent years, much attention has been given to the semantic gap problem in multimedia retrieval systems. Much effort has been devoted to bridging this gap by building tools for the extraction of high-level, semantics-based features from multimedia content, as low-level features are not considered useful: they deal primarily with representing the perceived content rather than its semantics.
In this paper, we explore a different point of view by leveraging the gap between low-level and high-level features. We experiment with a recent approach for movie recommendation that extracts low-level mise-en-scène features from multimedia content and combines them with high-level features provided by the wisdom of the crowd.
To this end, we first performed an offline performance assessment by implementing a pure content-based recommender system with three different versions of the same algorithm, based respectively on (i) conventional movie attributes, (ii) mise-en-scène features, and (iii) a hybrid method that interleaves recommendations based on movie attributes and mise-en-scène features. In a second study, we conducted an empirical study involving 100 subjects and collected data on the quality perceived by the users. Results from both studies show that introducing mise-en-scène features in conjunction with traditional movie attributes improves both the offline and online quality of recommendations.
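The hybrid method in (iii) interleaves two ranked lists. A minimal sketch of such interleaving (the exact policy used in the study is not specified here, so this round-robin merge with de-duplication is an assumption):

```python
from itertools import zip_longest

def interleave(ranked_a, ranked_b):
    """Round-robin merge of two ranked recommendation lists, skipping
    items already taken so each item is recommended at most once."""
    out, seen = [], set()
    for a, b in zip_longest(ranked_a, ranked_b):
        for item in (a, b):
            if item is not None and item not in seen:
                seen.add(item)
                out.append(item)
    return out

print(interleave(["attr1", "attr2", "attr3"], ["mise1", "attr2", "mise2"]))
# → ['attr1', 'mise1', 'attr2', 'attr3', 'mise2']
```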
Genetic diversity in the env V1-V2 region of proviral quasispecies from long-term controller MHC-typed cynomolgus macaques infected with SHIVSF162P4cy
Intra-host evolution of human immunodeficiency virus (HIV) and simian immunodeficiency virus (SIV) has been shown by viral RNA analysis in subjects who naturally suppress plasma viremia to low levels, known as controllers. However, little is known about the variability of proviral DNA and the inter-relationships among contained systemic viremia, rate of reservoir reseeding and specific major histocompatibility complex (MHC) genotypes, in controllers. Here, we analysed the proviral DNA quasispecies of the env V1-V2 region, in PBMCs and in anatomical compartments of 13 long-term controller monkeys after 3.2 years of infection with simian/human immunodeficiency virus (SHIV)SF162P4cy. A considerable variation in the genetic diversity of proviral quasispecies was present among animals. Seven monkeys exhibited env V1-V2 proviral populations composed of both clusters of identical ancestral sequences and new variants, whereas the other six monkeys displayed relatively high env V1-V2 genetic diversity with a large proportion of diverse novel sequences. Our results demonstrate that in SHIVSF162P4cy-infected monkeys there exists a disparate pattern of intra-host viral diversity and that reseeding of the proviral reservoir occurs in some animals. Moreover, even though no particular association has been observed between MHC haplotypes and the long-term control of infection, a remarkably similar pattern of intra-host viral diversity and divergence was found within animals carrying the M3 haplotype. This suggests that in animals bearing the same MHC haplotype and infected with the same virus, viral diversity follows a similar pattern with similar outcomes and control of infection
Deriving Item Features Relevance from Past User Interactions
Item-based recommender systems suggest products based on the
similarities between items, computed either from past user
preferences (collaborative filtering) or from item content features
(content-based filtering). Collaborative filtering has been proven to
outperform content-based filtering in a variety of scenarios. However, in
item cold-start, collaborative filtering cannot be used directly since
past user interactions are not available for the newly added items.
Hence, content-based filtering is usually the only viable option left.
In this paper we propose a novel feature-based machine learning
model that addresses the item cold-start problem by jointly
exploiting item content features and past user preferences. The model
learns the relevance of each content feature from the collaborative
item similarity, thereby embedding collaborative knowledge
into a purely content-based algorithm. In our experiments, the
proposed approach outperforms classical content-based filtering
on an enriched version of the Netflix dataset, showing that
collaborative knowledge can be effectively embedded into content-based
approaches and exploited in item cold-start recommendation.